Multiple Descent: Design Your Own Generalization Curve

Neural Information Processing Systems

This paper explores the generalization loss of linear regression in variably parameterized families of models, both under-parameterized and over-parameterized. We show that the generalization curve can have an arbitrary number of peaks, and moreover, the locations of those peaks can be explicitly controlled. Our results highlight the fact that both the classical U-shaped generalization curve and the recently observed double descent curve are not intrinsic properties of the model family. Instead, their emergence is due to the interaction between the properties of the data and the inductive biases of learning algorithms.


Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Wei, Wenbo, Le, Nicholas Chong Jia, Lai, Choy Heng, Feng, Ling

arXiv.org Artificial Intelligence

In deep learning, understanding the training dynamics has become paramount for enhancing model performance, generalization, and robustness. The training of deep neural networks involves navigating complex, high-dimensional parameter spaces, where the interplay between model complexity, dataset characteristics, and learning algorithms dictates the learning trajectory. This process is far from straightforward, often characterized by phenomena such as overfitting, underfitting, and various forms of descent in performance metrics. The dynamics of training deep neural networks are critical for several reasons. Generalization is a primary concern in machine learning, focusing on the model's ability to generalize from training data to unseen data.




Learning Curves for Sequential Training of Neural Networks: Self-Knowledge Transfer and Forgetting

Karakida, Ryo, Akaho, Shotaro

arXiv.org Machine Learning

Sequential training from task to task is becoming one of the central topics in deep learning applications such as continual learning and transfer learning. Nevertheless, it remains unclear under what conditions the trained model's performance improves or deteriorates. To deepen our understanding of sequential training, this study provides a theoretical analysis of generalization performance in a solvable case of continual learning. We consider neural networks in the neural tangent kernel (NTK) regime that continually learn target functions from task to task, and investigate the generalization by using an established statistical-mechanical analysis of ridgeless kernel regression. We first show characteristic transitions from positive to negative transfer: targets whose similarity exceeds a specific critical value achieve positive knowledge transfer to the subsequent task, whereas catastrophic forgetting can occur even between very similar targets. Next, we investigate a variant of continual learning in which the model learns the same target function in multiple tasks. Even for the same target, the trained model shows some transfer and forgetting depending on the sample size of each task. We can guarantee that the generalization error monotonically decreases from task to task for equal sample sizes, while unbalanced sample sizes deteriorate generalization. We refer to this improvement and deterioration as self-knowledge transfer and forgetting, respectively, and empirically confirm them in realistic training of deep neural networks as well.
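The same-target variant described above can be sketched numerically. The snippet below is a minimal, hedged illustration: it uses an RBF kernel as a stand-in for the NTK (the paper works with the exact NTK), and trains a near-ridgeless kernel regressor task after task by fitting each new task's residuals, mimicking continued gradient training to convergence in the kernel regime. The helpers `rbf`, `predict`, `sequential_fit`, and `make_task`, and the `sin(3x)` target, are all assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(A, B, gamma=1.0):
    # RBF kernel as a simple stand-in for the NTK (an assumption).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict(components, X):
    # Predictor accumulated over tasks: f(x) = sum_i K(x, X_i) @ alpha_i.
    f = np.zeros(len(X))
    for Xi, ai in components:
        f += rbf(X, Xi) @ ai
    return f

def sequential_fit(tasks):
    # Each task fits the residual of the current predictor with
    # (near-)ridgeless kernel regression; the tiny jitter is only
    # for numerical stability.
    components = []
    for X, y in tasks:
        resid = y - predict(components, X)
        K = rbf(X, X) + 1e-6 * np.eye(len(X))
        components.append((X, np.linalg.solve(K, resid)))
    return components

def target(x):
    # Same noiseless target function in every task (an assumption).
    return np.sin(3 * x).ravel()

def make_task(n):
    X = rng.uniform(-1, 1, size=(n, 1))
    return X, target(X)

X_test = np.linspace(-1, 1, 200)[:, None]
y_test = target(X_test)

# One task vs. two sequential tasks with equal sample sizes.
err_one = np.mean((predict(sequential_fit([make_task(30)]), X_test) - y_test) ** 2)
err_two = np.mean((predict(sequential_fit([make_task(30), make_task(30)]), X_test) - y_test) ** 2)
# With equal sample sizes, the fresh samples of the second task
# typically reduce test error further (self-knowledge transfer).
```

Making the second task much smaller than the first is the unbalanced regime in which the abstract's self-knowledge forgetting can appear.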